DOCUMENT RESUME

ED 081 589                                              SE 016 527

TITLE           Improving the Evaluation of Peace Corps Training
                Activities. Volume II of the Report of Supplemental
                Activities.
INSTITUTION     Center for Research and Education, Denver, Colo.
SPONS AGENCY    ACTION, Washington, D.C.
REPORT NO       PC-72-42043
PUB DATE        4 Jun 73
NOTE            75p.
AVAILABLE FROM  Center for Research and Education, 2010 East 17th
                Avenue, Denver, Colorado 80206 ($3.00 plus postage)

EDRS PRICE      MF-$0.65 HC-$3.29
DESCRIPTORS     *Cross Cultural Training; Evaluation; *Measurement
                Instruments; *Program Effectiveness; *Rating Scales;
                *Systems Analysis
IDENTIFIERS     *Peace Corps; Research Reports



ABSTRACT

The purpose of this project/study was to review the evaluation system
presently being used to assess the effectiveness of Peace Corps training
activities in Brazil and to modify rating instruments and scales in order
to obtain more accurate measurement. This volume, the second of two
reports resulting from the project, describes the procedures and methods
used, followed by the scaling instruments produced, complete with detailed
instructions for their use, scoring, and accuracy computation. Evaluation
scaling systems are developed for both training programs and training
activities. (For related document, see Volume I, SE 016 526.) (BL)




U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION
THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.



IMPROVING THE EVALUATION

OF

PEACE CORPS TRAINING ACTIVITIES

Volume II

of the report of supplemental activities
conducted under ACTION Contract PC-72-42043



© 1973

Center for Research and Education
Denver, Colorado (USA)

June 4, 1973

"PERMISSION TO REPRODUCE THIS COPYRIGHTED
MATERIAL HAS BEEN GRANTED BY



Jieonard_Ka.t_< 
CRE 



TO ERIC AND ORGANIZATIONS OPERATING
UNDER AGREEMENTS WITH THE NATIONAL
INSTITUTE OF EDUCATION. FURTHER REPRODUCTION
OUTSIDE THE ERIC SYSTEM REQUIRES
PERMISSION OF THE COPYRIGHT OWNER."

"PERMISSION TO REPRODUCE THIS COPYRIGHTED
MATERIAL HAS BEEN GRANTED BY

Joseph F. Radford
ACTION

TO ERIC AND ORGANIZATIONS OPERATING
UNDER AGREEMENTS WITH THE NATIONAL
INSTITUTE OF EDUCATION. FURTHER REPRODUCTION
OUTSIDE THE ERIC SYSTEM REQUIRES
PERMISSION OF THE COPYRIGHT OWNER."



CENTER FOR RESEARCH AND EDUCATION
2010 E. 17TH AVENUE, DENVER, COLORADO 80206
CABLE: CREDEN



This material may not be reproduced in whole or in part without the written
permission of the Center for Research and Education and ACTION.

Copies of this publication are available for the cost of reproduction and 
handling ($3.00 plus postage) from the Center for Research and Education, 
2010 East 17th Avenue, Denver, Colorado 80206. 






VOLUME II

Table of Contents

                                                              Page

Preface

CHAPTER I.   INTRODUCTION                                        1

CHAPTER II.  PROCEDURES AND METHODS

    The Retranslation Method                                     3
    The Mixed Standard Scale                                     4
    Development of Training Program Evaluation Scales            9
    Development of Training Activity Evaluation Scales          19

CHAPTER III. PRODUCTS

    Training Program Evaluation Scales                          25
        Training Program Evaluation Scales                      29
        Scoring Keys                                            34
        Scoring Matrix                                          36
        Scoring Work Sheet                                      37
        Summary Sheet                                           38
        Error Computation Work Sheet                            39
        Error Key                                               42
        Example of Error Computation                            43

    Training Activity Evaluation Scales                         46
        Training Activity Evaluation Scales                     49
        Scoring Keys                                            52
        Scoring Matrix                                          57
        Scoring Work Sheets                                     58
        Summary Sheet                                           63
        Error Computation Work Sheets                           64
        Error Key                                               65

CHAPTER IV.  CONCLUSION                                         67

List of References






Preface 

(We are repeating here the preface to Volume I of the report of
supplemental activities conducted under Peace Corps Brazil Training Contract
PC-72-42043 to give you the context in which to read Volume II, which is
contained in this document.)

The study described here was performed during the two-month period of
January-February, 1973. A simple statement like this seems rather
meaningless apart from the full realization of the tremendous complexity of
the study and the scope of tasks involved. The only other study conducted in
Brazil with any similarity to this one was the Sao Francisco Valley Evaluation
Project completed by Wayne Holtzman and associates of the University of Texas
in 1966. The outcomes of that three-year effort, compared to those of the
present two-month study, give some perspective to what we were able to
accomplish in such a short period of time.

In addition to the constant pressure of time, the large distances
involved and the accompanying logistical problems were the major difficulties
encountered in completing the work. Maintaining a tight discipline in the
rigorous implementation of the study design was difficult, to say the least,
when operating from a Colorado base, through our Brazil office, and from
there covering a major portion of the large expanses of Brazil. The success
we were able to achieve is due to the untiring efforts of a very talented
staff and the impressive cooperation of the Peace Corps staff and Volunteers
in Brazil.

The project was originally designed according to three different 
tracks, or intended outcomes: 

A. The design of a system for measuring cross-cultural learning
   and change,

B. The design of evaluation instruments and procedures to accurately
   assess the effectiveness of specific training activities, and

C. Recommendation of improvements in assessing cross-cultural
   training needs and improvements in training by establishing
   those benchmark requirements of cross-cultural experience
   which should be incorporated into training.

As the project got under way, it soon became apparent that Tracks A
and C were so interrelated and dependent upon one another that they should
be combined, while Track B could be accomplished somewhat independently of
the other two. Accordingly, the project consists of two components: one
we titled "Improving Cross-Cultural Training and the Measurement of
Cross-Cultural Learning" (Tracks A and C) and the other, "Improving the
Evaluation of Peace Corps Training Activities" (Track B).

The first component has been written under this cover as Volume I. 
The second component has been written under separate cover as Volume II. 

The study was coordinated by Dr. Michael F. Tucker, Associate Director
of the Center for Research and Education. Project staff for the first
component included Howard A. Raik, David L. Rossiter, and Dr. Michael J. Uhes.
Paul C. Jorgensen, a CRE/Brazil staff member, participated in the field work
in Brazil and during the early drafting phase in Denver. Mr. Raik completed
a significant amount of the final drafting of the manuscript.

Mrs. Joanna D. Carver was the project staff member primarily responsible 
for the second component. 

Thomas Brand, Allan S. Dorsey and Guaraciema Rodriguez Dorsey,
CRE/Brazil staff members, assisted in the final preparation and translation of
the data collection instruments and in the interviewing of Volunteers in the
field. Jose Eduardo Barbosa provided logistic and administrative support
from the CRE office in Belo Horizonte.

Delano M. B. Carvalho, Edmar da Costa Marques, Jose Afonso de Melo,
and i^^'^alberto Ribeiro collaborated by interviewing Brazilian associates of
the Volunteers during the interview phase. Paulo Assis identified the
Brazilian sample and completed the data collection for this group.

Drs. Charles Wagley, T. Lynn Smith and Maxine Margolis of the University
of Florida and Dr. Harry W. Hutchinson of the University of Miami aided in
conceptualizing the theoretical basis for the Cultural Dimensions Test.
Dr. Margolis also prepared test items for inclusion in the Factual
Information Test.

Dr. Etemiel Anderson, Gary Hodson and Sandy Hodson of the University of
Northern Colorado identified the sample of Americans with no Brazil
experience (called "Naive Americans" for the purposes of this report),
collected data from this group, and performed the statistical analyses on
all the collected data.

James Doxsey provided consulting assistance in the early phases,
particularly with respect to the measurement of affective behavior. Dr.
Lawrence R. James provided consulting assistance in psychometrics and
scaling and was instrumental in outlining the scaling procedures followed
to produce instruments for training activity evaluation.

Associate Brazil Peace Corps Representatives Vitor Braga (Minas
Gerais), Charles Cox (Ceara), James LaFleur (Bahia), Cornelio Lana,
Denis L. Lynch (Mato Grosso), Marco Mota (Pernambuco), and George Van Antwerp
(Rio Grande do Norte) and Program and Training Officer Robert Gentile
provided nominations for Volunteer samples and invaluable logistic assistance.

We wish to thank the Volunteers interviewed in Minas Gerais, Bahia,
Mato Grosso, Pernambuco, Rio Grande do Norte, and Ceara, the personnel of
the agencies with which they are working, and their associates whom we
interviewed, for their patience and cooperation.




We wish to thank also the men and women in Minas Gerais and in Colorado
who volunteered to be tested for the Brazilian and Naive American samples.

We feel that the benefits derived from a project of this nature are
extremely important to the improvement of Peace Corps training. This study
is the first opportunity of this type CRE has had in the three years since
completing the Guidelines for Cross-Cultural Training. Training for cultural
adaptation is a complicated matter, the very nature of which undergoes rapid
change as understanding develops through experience. We hope that what we
have accomplished will be of use to Peace Corps trainers in pushing forward
the "state of the art" and in helping Americans adapt to other cultures in
more effective ways. As is true of most endeavors of this sort, this study
is just a start; we hope this effort will provide a base for continuation,
modification, and improvement of cross-cultural training and measurement.



Michael F. Tucker 

Center for Research and Education 

Denver, Colorado 
March 1973 






CHAPTER I. INTRODUCTION



The purpose of this project was to review the evaluation system
presently being used to assess the effectiveness of Peace Corps training
activities in Brazil and to modify rating instruments and scales in order to
obtain more accurate measurements.

The present system, often referred to as the "weekly" or "running"
evaluation of training, was instituted in Brazil training programs by the
Peace Corps in order to provide a consistent flow of information regarding
the progress of training operations. It consists of three questions which
ask trainees to respond, according to a seven-point scale, as to the
effectiveness of various training activities. The questions have to do with
(1) how well a particular training activity was conducted, (2) how important
and significant the content of the activity was, and (3) how relevant the
learning achieved was to the future Volunteer job.

The primary weaknesses in this system can be listed as follows: 

o The three questions do not represent a good composite of training
  activity assessment. Such things as the objectives of the activity
  and quality of materials used are not included.

o The three questions are not unidimensional. In fact, they
  probably represent seven or more dimensions. The first question
  asks for a single response to how well the activity was conducted;
  how good the trainer was in conducting the activity; and how good
  the method of training was. The second question asks for a single
  response to how important and how significant the content of the
  activity was. The third question asks for a single response to
  how relevant the learning achieved was to the future Volunteer
  job, and also calls for a judgment of whether something was
  learned that changed behavior, would affect future performance
  on the job, and would be applied on the job.

o These three questions were developed and written by Peace Corps
  and training staff professionals, so that their intent, meaning,
  importance, relevance and "face validity" are not always clear
  to trainees.

o The seven-point rating scales were not well developed and are
  not psychometrically sound. No reliability or validity data
  exist for these scales, but several indications of imprecision
  are apparent. Each point on the scale is not psychologically
  anchored, i.e., there is no provision for assessing consistency
  in meaning among trainees for giving a "3" rating versus a "5"
  (for example) on any given judgment. The respondent simply
  "rates toward 7" or "rates toward 1" for each question.

o Data resulting from these scales suffer to a great extent from
  the two most common errors of rating: the halo effect and
  leniency. The halo effect means that a given training activity
  is rated the same across the three different questions due to
  a generalized feeling about the activity rather than the
  intended three-criterion discriminatory judgments. Leniency
  errors result in rating distributions that are inaccurately
  high and are therefore skewed and have small ranges, i.e., the
  mean for a seven-point scale should generally be 4.0 with a
  normal distribution, but the means for these scales are almost
  always much higher with skewed distributions.
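
As an illustration only, the two rating errors described above can be
checked numerically. The following is a minimal Python sketch and is not
part of the original study: the ratings are invented, and the function
names leniency and pearson are hypothetical. It assumes that a mean well
above the scale midpoint of 4.0 suggests leniency, and that a high
correlation between the answers to two different questions suggests a
halo effect.

```python
from statistics import mean

SCALE_MIDPOINT = 4.0  # midpoint of a seven-point scale (1 to 7)


def leniency(ratings):
    """Mean rating minus the scale midpoint; positive values suggest leniency."""
    return mean(ratings) - SCALE_MIDPOINT


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Invented ratings: one row per trainee, one column per question (1, 2, 3).
ratings = [
    (6, 6, 7),
    (7, 7, 7),
    (5, 5, 6),
    (6, 6, 6),
    (7, 6, 7),
]
q1, q2, q3 = (list(column) for column in zip(*ratings))

print("leniency per question:", [round(leniency(q), 2) for q in (q1, q2, q3)])
print("question 1 vs. question 2 correlation:", round(pearson(q1, q2), 2))
```

A correlation near 1.0 between questions that are meant to measure
different things, together with means far above 4.0, would be consistent
with the halo and leniency errors described in the text.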

This project was undertaken to develop scales that would overcome these
weaknesses and result in an evaluation method that would accurately reflect
the quality of training activities. The remainder of this report will
describe the procedures and methods used, followed by the scaling instruments
produced, complete with detailed instructions for their use, scoring, and
accuracy computation.



CHAPTER II. PROCEDURES AND METHODS 

The first task was to review the literature on evaluation strategies,
psychometric theory, and scaling methods (see the list of references on
Page 69) with the objective of selecting an approach or a combination of
approaches that would overcome each of the deficiencies in the present
assessment procedure. Two methods emerged that appeared especially promising,
both of which had been developed for use in rating personnel performance.
The first was developed over ten years ago (Smith and Kendall, 1963) and is
called the Retranslation of Expectations method of constructing unambiguous
anchors for rating scales. This method was selected for the purpose of
constructing properly worded statements for the scales. The second method is
a recent innovation (Blanz and Ghiselli, 1972) for constructing scales,
called the Mixed Standard Scale. As far as we know, neither of these methods
has yet been used to evaluate training activities, and the two have not yet
been combined into a single system. A combination of the two methods was
decided upon as having great potential for our purposes.

The Retranslation Method 

The Retranslation Method was developed by Smith and Kendall in an
attempt to construct rating statements that are clear, meaningful, and
useful to those who actually have to complete the ratings. They observed that
most rating procedures are constructed by professionals (usually psychologists)
and that the meaning of resulting items usually is not interpreted
consistently among those who are expected to make the ratings. They reasoned
that statements for rating scales written in the language of the raters
themselves would be less ambiguous and more representative of actual
situations than statements developed by psychologists. The important feature
of the method, therefore, is the participation of the rater population in
deciding what items are important for inclusion in the rating scale and how
the rating statements should be written. This method appeared well suited for
constructing scales to be used by trainees in assessing Peace Corps training
programs, as a common complaint has been that trainees don't always consider
the ratings they are asked to make as being very important and there has been
confusion among them as to the meaning of scales constructed by the training
staff.

The Mixed Standard Scale

The Retranslation Method, as modified for purposes of this study,
results only in the production of statements for rating scales. Construction
of the scales, or arranging judgments along a quantifiable continuum, is the
second part of the problem. The Mixed Standard Scale was first used by
Blanz in Finland, and later developed by Blanz and Ghiselli for use in the
United States. It was developed in order to minimize the common rating
errors of halo and leniency (described previously) as well as to permit an
evaluation of the reliability with which each thing is rated, each scale
rates, and each rater rates. Reliability ordinarily is determined for the
entire process of ratings, which includes the scale, the thing being rated,
and the rater. The Mixed Standard scaling procedure makes it possible to
differentiate between the accuracy of each, and to inquire into the
reliability with which different things are rated, the reliability with which
different scales measure, and the reliability of the ratings assigned by
different raters.




With most rating procedures the rater is presented with different
degrees of "goodness" for each of a number of separate criteria (e.g., trainer
skill, clarity of objective) pertaining to a training program, and he selects
the one which best describes the program or activity. In the Mixed Standard
rating scale there are descriptions of three degrees of each criterion to be
rated, and the rater must respond to every description independently. He
indicates whether he considers the program or activity to be better than
the description, to fit the description, or to be worse than the description.

To reduce the possibility that a rater will form a clear picture of
an order of merit set of descriptions for each criterion being rated, the
scales and the order of the three statements in them are mixed in a random
order. Thus the rater does not deal with all of the statements related to
one and the same criterion simultaneously, for he has to rate with respect
to each given statement separately. The rater fills out one rating form
for each training program or separate activity to be rated. Once the form
has been completed, the answers may be rearranged into the form of the
commonly employed rating scale so that the questions and answers for each
criterion follow one another in order of superiority. By this means, all
of the ratings given by the rater on any particular criterion can be viewed
simultaneously. By contrast, the rater himself could not do this and could
not make a choice among them.

There are two purposes for this procedure. First, the mixing reduces
the possibility that the rater will be able to form a clear picture of any
order of merit set of descriptions for each criterion being rated. Thus it
is anticipated that the errors of halo and leniency will be reduced. Secondly,
the mixing provides a means for examining the dependability and reliability
of the ratings, for it permits the ratings to be examined in light of the
consistency or logic of the answers to the different statements relating to
the same criterion (Blanz and Ghiselli, 1972, pp. 186-187).
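
The mixing and later rearranging of statements can be sketched in a few
lines of Python. This is an illustrative sketch under stated assumptions,
not the procedure used to print the actual forms: the function names
build_mixed_form and regroup are hypothetical, and the statement texts
are placeholders.

```python
import random
from collections import defaultdict


def build_mixed_form(criteria, seed=None):
    """Return every (criterion, level) pair in random order.

    `criteria` maps a criterion name to its three statements, best first,
    so level 0 is statement I and level 2 is statement III.
    """
    items = [(name, level)
             for name, statements in criteria.items()
             for level in range(len(statements))]
    random.Random(seed).shuffle(items)
    return items


def regroup(form, responses):
    """Rearrange responses given in form order into per-criterion lists,
    ordered from the best statement to the poorest."""
    grouped = defaultdict(dict)
    for (name, level), response in zip(form, responses):
        grouped[name][level] = response
    return {name: [by_level[i] for i in sorted(by_level)]
            for name, by_level in grouped.items()}


# Placeholder criteria; the real statements appear in the scales themselves.
criteria = {
    "materials": ["statement I", "statement II", "statement III"],
    "trainer skill": ["statement I", "statement II", "statement III"],
}
form = build_mixed_form(criteria, seed=1)
for name, level in form:
    print(name, "-", criteria[name][level])
```

The rater sees only the shuffled list; regroup is applied afterwards by
the scorer, which mirrors the point made in the text that the rater
cannot view all three statements of one criterion together.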

The method can be illustrated by means of the following example. Let
us assume that the criterion to be rated is "the quality of the training
materials used in a particular activity" and has on a scale the following
three statements, I being the best description and III the poorest:

  I. The materials used in this activity were well prepared and very
     relevant to the purpose of the activity.

 II. The materials used in this activity were adequate and seemed
     moderately relevant to the purpose of the activity.

III. The materials used in this activity were not well prepared and
     seemed irrelevant to the purpose of the activity.

If a rater utilizes the procedure accurately, then whenever he checks
one statement in a scale as "fits or matches the training activity under
consideration" (0), all statements in that scale which describe the activity
as inferior will be checked as "the activity was better than the statement"
(+). If all three statements are checked with a (+), it means that in the
rater's opinion the activity is very good in this criterion, for the activity
was superior even to the very best of the three descriptions. Similarly,
if all three statements in a scale are checked with a (-), it means that in
the rater's opinion the activity was very poor, for the activity was inferior
even to the very poorest of the three descriptions.

With the three graded statements used in this manner, there is actually
a seven-point scale on each criterion, which also is an improvement on
ordinary rating scales. Pursuant to the logic of the system, the various
combinations of faultless responses to the items can be arranged as follows
and can be assigned the number of points indicated:

                         Table 1
               Descriptive Statement Scale

              Descriptive Statements
               I       II      III        Points

               +       +       +            7
               0       +       +            6
               -       +       +            5
               -       0       +            4
               -       -       +            3
               -       -       0            2
               -       -       -            1

    + = The ratee is better than the statement.
    0 = The statement fits the ratee.
    - = The ratee is worse than the statement.

The foregoing combinations are faultless because there are no reversals
in the order with which the three graded descriptions are checked. That is,
whenever a statement is checked with a (0), no statement which describes
better performance is checked either (0) or (+) and no statement which
describes inferior performance is checked either (0) or (-). Furthermore, (0),
which means the statement fits the activity, is not employed for two or more
statements which describe degrees of the criterion. All combinations of
responses to the three statements, other than the seven given above, are
illogical and inconsistent and therefore in error. Nevertheless, the logic
of the system permits such scales to be scored. The scoring system for
inconsistent responses, that is, where the ratings are in error, is given
in Table 2.
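
The rule for faultless combinations can be expressed compactly. The
following Python sketch is an illustration, not the scoring key supplied
with the instruments; the names FAULTLESS_POINTS, is_faultless, and score
are hypothetical. It encodes only the seven faultless combinations and
their points, and returns None for any inconsistent combination, which
must be scored from the table for illogical responses instead.

```python
from itertools import product

# Responses to statements I (best), II, and III (poorest), in that order,
# with the points assigned to each faultless combination.
FAULTLESS_POINTS = {
    ("+", "+", "+"): 7,
    ("0", "+", "+"): 6,
    ("-", "+", "+"): 5,
    ("-", "0", "+"): 4,
    ("-", "-", "+"): 3,
    ("-", "-", "0"): 2,
    ("-", "-", "-"): 1,
}

# Ordering of the three possible checks, from "worse than" to "better than".
ORDER = {"-": 0, "0": 1, "+": 2}


def is_faultless(responses):
    """True when there are no reversals from statement I to statement III
    and (0) is used for at most one statement."""
    ranks = [ORDER[r] for r in responses]
    no_reversal = ranks == sorted(ranks)
    single_fit = tuple(responses).count("0") in (0, 1)
    return no_reversal and single_fit


def score(responses):
    """Points for a faultless combination, or None if it is inconsistent."""
    return FAULTLESS_POINTS.get(tuple(responses))


all_combinations = list(product("+0-", repeat=3))
faultless = [c for c in all_combinations if is_faultless(c)]
print(len(faultless), "of", len(all_combinations), "combinations are faultless")
```

Checking all possible combinations of three checks confirms that the
rule stated in the text admits exactly the seven combinations that carry
points.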



                         Table 2

    Points to be Assigned for Combinations of Responses
    Which are Not Logical, and Therefore are in Error

                   Combination
               I       II      III        Points

                               0            7
               +                            7
               0               0            6
               0       +                    6
                       +       0            5
                       +                    5
               0               +            5
               0       0                    4
               +       0       +            4
               +       0                    4
               0       0       0            4
                       0                    3
               +               +            3
                       0       0            3
               0               0            2
               +               0            2
               +                            1
               0                            1

    + = Ratee is better than this statement.
    0 = Statement fits the ratee.
    - = Ratee is worse than this statement.



The determination of the consistency of the ratings, i.e., the number
of errors, in fact amounts to a scalogram analysis (Edwards, 1957 and
Torgerson, 1958). A variety of different sorts of error counts can be made
depending upon the type of accuracy with which one is concerned. Counts
can be made of the number of errors per activity, the number of errors per
scale, and the number of errors per rater (Blanz and Ghiselli, 1972,
pp. 187-188).
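
The three kinds of error count can be sketched as follows. This Python
example is an assumption-laden illustration, not the Error Key of this
report: it counts one error for each scale whose three responses do not
form a faultless combination, the records are invented, and the name
count_errors is hypothetical.

```python
from collections import Counter

# The seven faultless combinations, written as the responses to
# statements I, II, and III in that order.
FAULTLESS = {"+++", "0++", "-++", "-0+", "--+", "--0", "---"}


def count_errors(records):
    """Count inconsistent scales per rater, per activity, and per scale.

    Each record is (rater, activity, scale, responses), where `responses`
    is a three-character string for statements I, II, and III.
    """
    per_rater, per_activity, per_scale = Counter(), Counter(), Counter()
    for rater, activity, scale, responses in records:
        if responses not in FAULTLESS:
            per_rater[rater] += 1
            per_activity[activity] += 1
            per_scale[scale] += 1
    return per_rater, per_activity, per_scale


# Invented records for two raters, two activities, and two scales.
records = [
    ("rater 1", "language class", "A", "0++"),
    ("rater 1", "language class", "B", "+0+"),  # reversal, counted as error
    ("rater 1", "site visit", "A", "--0"),
    ("rater 2", "language class", "A", "00+"),  # (0) used twice, error
    ("rater 2", "site visit", "B", "0-0"),      # reversal, counted as error
]
per_rater, per_activity, per_scale = count_errors(records)
print("errors per rater:   ", dict(per_rater))
print("errors per activity:", dict(per_activity))
print("errors per scale:   ", dict(per_scale))
```

Tallying the same inconsistent responses three ways is what allows the
accuracy of the rater, the scale, and the thing rated to be examined
separately.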




The exact nature of these two methods, in terms of our modification
and combination of them, is best described according to the steps followed
in this project and the scales that resulted from our work. For reasons
described later, it was necessary to develop two separate evaluation scales:
one for use in assessing the overall training program according to critical
criteria defined by trainees, and the other for use in assessing the
effectiveness of specific training activities according to criteria defined
by professional trainers.

Development of Training Program Evaluation Scales 

Step 1. Identification of Training Assessment Criteria 
The first step in developing scales for use by trainees in assessing
the effectiveness of training according to the Retranslation Method would
have been to obtain from trainees statements describing important assessment
criteria. Since there were no trainees engaged in training at the time of
this study, this was not possible. It was decided, therefore, to sample
Volunteers who had participated at different times in several different
training programs. Volunteers located in Natal, Rio Grande do Norte;
Salvador, Bahia; and Rio de Janeiro were identified for participation in
developing the scales. These Volunteers had experienced several different
training programs and had been out of training anywhere from one week to
one and one-half years.

Nineteen Volunteers were interviewed in Natal for purposes of
identifying assessment criteria. Information was obtained in small discussion
groups as they responded to the following questions:

o What are the important dimensions of training?

o What were the things that influenced your learning?

o What good things happened in training that helped you adapt to
  Brazil?

o Can you recall the things that hindered your learning?

o What training experiences seemed most important to you?

The Volunteers were asked to couch their responses in terms of use for
evaluation purposes. A large amount of information was thus elicited and
documented for later use.

Step 2. Drafting Criterion Statements 

The information gathered in Step 1 was organized into a series of
statements, each representing a separate idea generated by the Volunteers.

Step 3. Checking the Accuracy of Criterion Statements
The Volunteers who had participated in the original interviews were
interviewed a second time. They were asked to examine the draft statements
for accuracy in reflecting their views, to make changes where necessary, and
to expand the list if important things were left out.

Step 4. Cross-checking the Accuracy of Criterion Statements 
In order to generalize and provide a cross-validation for the criterion
statements, a second group of Volunteers was interviewed in Salvador and a
third group in Rio de Janeiro. They were asked to examine the statements
for accuracy in reflecting their views about training, to make necessary
modifications, and to expand the list. In addition to these three groups
of Volunteers, a number of Peace Corps/Brazil staff members, as well as
training staff personnel, were consulted for further checking the accuracy
and clarity of the criterion statements.



Step 5. Ranking the Criterion Statements in Order of Importance
A total of twenty-four Volunteers in Natal and Salvador were asked to
rank the fifty-eight statements in order of importance. Each Volunteer
worked alone and recorded his choices on an individual record card. A list
of these statements is presented below in the resulting rank order along
with the rank value. The weighted rank value was computed by adding the
total number of ranks given to each statement and dividing this total by the
number of Volunteers providing the ranks.



        Rank
Rank    Value   Statement

( 1)     79     Trainers have professional competence in the participative
                and experiential methods of training.

( 2)     73     Training staff is together, well-coordinated, task-oriented
                toward helping trainee.

( 3)     73     Learning climate is psychologically free from rigid
                interactions and conducive to motivated learning in a
                flexible setting.

( 4)     70     Language teaching method has sufficient variety of techniques
                to consistently motivate language learning.

( 5)     68     Learning climate is physically good and conducive to serious
                study.

( 6)     68     Job information specific and accurate.

( 7)     68     Staff is oriented toward helping and supporting each trainee.

( 8)     67     Director is easy to talk to, and approach, outside of class.

( 9)     67     Training program is primarily experiential, emphasizing,
                from the first week, learning through experiences outside
                the center.

(10)     65     Language staff easy to approach and talk to outside of class.

(11)     63     Language and cross-cultural training are coordinated through
                well chosen training center and off-center experiences.

(12)            Criticisms of program easily and pleasantly accepted by
                language staff.

(13)            Flexible pickup on trainee suggestions.

(14)            Trainees are offered adequate opportunities to sample
                off-center cultural activities.

(15)            Allocation of trainee time is compatible with trainee energy
                levels and leisure time needs.

(16)            Trainer is a "facilitator," on equal social status with
                trainee, encouraging trainee to manage his own training plan.

(17)            Cross-cultural training emphasizes "openness" and approaches
                that will serve as guidelines.

(18)            The training program appears relevant to the Brazil Peace
                Corps Volunteer program realities.

(19)            Technical orientation has realistic, job-centered objectives.

(20)            Brazilian and American staff both follow the same training
                philosophy.

(21)            The Director participates fully in the training program,
                making himself easily accessible to trainees.

(22)            Supplemental language activities and materials stimulating
                to trainees, motivating language learning.

(23)            Trainees learn "social expectations," etiquette, common
                mannerisms and behaviors.

(24)            Trainee feels he is trusted to learn at his own pace in
                language training.

(25)            Each trainee has a staff member to chat with, someone with
                whom to share misgivings, doubts, and anger about Peace Corps
                and training, without fear of defensiveness or reprisal.

(26)            Trainees participate in the planning of the weekly activities.

(27)            Trainee suggestions immediately discussed and acted upon.

(28)            Peace Corps Volunteer job is clearly defined.

(29)            Staff includes trainees in planning sessions for the next
                week.

(30)            Defensiveness of key staff members hinders learning, destroys
                free give and take of the learning climate.

(31)            Specialists receive special career consideration in scope
                of Peace Corps job description.

(32)            Trainee helps to firm up his own job description through
                site visits and Peace Corps programmer help.

(33)            Trainee experience in pensao is helpful to the goal of
                adaptation to Brazil.

(34)            Trainee feels he is trusted to learn at his own pace.

(35)            Trainee manages his activities toward approaching his site
                and re-negotiating his job description.

(36)            Defensiveness of key staff members discourages suggestions
                from trainees, limits participation.

(37)            Trainees trust the trainers to competently lead them into
                adaptation to Brazilian life.

(38)            Training staff facilitate trainee relationships with Peace
                Corps Brazil staff by helping to build rapport during
                training.

(39)            Health referrals expedited for those trainees who need
                professional care.

(40)            After one month, trainees visit site and rewrite enroute
                and terminal objectives.

(41)            The estagio experience hastened adaptation to Brazilian
                life.

(42)            Library resources and handouts adequate for learning needs.

(43)            The estagio experience improved communication skills with
                Brazilians.

(44)            Trainer follows Peace Corps programmer lead for first site
                visit and verification of job description.

(45)            Trainees assigned to homes during first two weeks.

(46)            Trainees help to write their own nucleos for language study.

(47)            Language ratings on interim objectives are good learning
                devices.

(48)            Trainees help to write critical incidents on basis of
                off-center experience.

(49)     40     Trainee's allowance is adequate for his needs.

(50)     40     Women's adaptation needs are addressed specifically.

(51)     38     Critical incidents hit at major issues of adaptation.

(52)     37     Trainees may invite Brazilians to frequent parties, where
                trainers also attend.

(53)     33     Terminal objectives are being accomplished in the ten weeks
                allotted.

(54)     32     Trainee gives effective report of his estagio.

(55)     31     Spouses treated as adults and full Peace Corps members.

(56)     30     Enroute objectives are being met on schedule.

(57)     29     The training design is being implemented according to its
                original plan.

(58)     27     Trainee learns to write a business letter, use banking forms.

This rank distribution was then examined, and the top twenty items were 
selected for further development (ranked items 1 through 20).

Step 6. Construction of Criterion Statement Categories

The twenty statements selected in Step 5 were studied to determine the
dimensions of training assessment criteria they represented. This resulted
in the construction of the following fourteen categories:

1. Training staff expertise in applying Peace Corps methodology

2. Training staff team performance

3. Training staff availability to trainees

4. Training program director availability and responsiveness

5. Experiential learning based on host community environment

6. Training staff responsiveness to trainee suggestions

7. Cross-cultural training method

8. Language training method

9. Coordination of resources for individual needs

10. Realistic job-centered objectives




11. Accurate job descriptions 

12. Opportunities to sample off-training-center Brazilian cultural
activities

13. Physically adequate learning climate 

14. Training schedule 

It was decided that Categories 1-9 (labeled A through I in our scales) would 
be developed into rating scales according to the Mixed Standard scaling 
method, while 10-14 would be included as "yes" or "no" questions.

Step 7. Drafting Degrees of Effectiveness for Criterion Categories

Each of the first nine categories constructed in Step 6 was studied
separately for the purpose of developing rating items. A set of three
statements was written for each category, the first representing high
effectiveness, the second representing medium effectiveness, and the third
representing low effectiveness.

Step 8. Checking for Accurate Inclusion of Effectiveness Statements
in Original Categories 

A group of eleven judges was selected to determine whether the sets 
of effectiveness statements were perceived as clearly belonging to the 
original categories (an important consideration in determining scale errors 
in the Mixed Standard scaling procedure). The twenty-seven effectiveness 
statements were arranged in random order on cards, and the nine categories 
were written on cards and placed side by side. The judges were asked to 
place each statement in the category they thought it clearly represented. 
The results of the judging indicated a high accuracy in the statements being 
placed in their original categories. There were some errors made in Cate- 
gories A, C, F, and I, however, so the ambiguous statements causing these 
errors were modified. 




Step 9. Checking for Accurate Degrees of Effectiveness for Criterion
Categories

The same eleven judges who checked for accurate inclusion of effectiveness
statements in original categories were used to check the accuracy
of the draft statements indicating degrees of effectiveness for each of the
categories. This was done by arranging the twenty-seven statements in random
order and having the judges place each set of three statements in their
original categories as indicated in Step 8. When each set of three statements
was placed in its proper category, the judges were asked to order the
three statements from high to low according to their perception of the degree
of effectiveness represented. Again the results of this judging indicated
a high degree of accuracy in the draft statements being written in the proper
order of high, medium, and low effectiveness. There were some errors made
[illegible in source] were rewritten.

The twenty-seven statements arranged in their proper order of effec- 
tiveness and in the proper categories, resulting from the procedures described 
in Steps 1-9, are listed below: 

Category A. Training Staff Expertise in Applying Peace Corps Training 
Methodology 

I. The majority of training staff competently apply appropriate
Peace Corps training methodology, showing kindness
and consistency in the way they deal with trainees.

II. About half of the training staff competently apply appropriate
Peace Corps training methodology, showing kindness
and consistency in the way they deal with trainees.

III. Only a few trainers competently apply appropriate Peace
Corps training methodology, showing kindness and consistency
in the way they deal with trainees.






Category B. Training Staff Team Performance

I. The staff appear to have a good team approach toward
conflict resolution and building a favorable learning
climate of open interaction among themselves and with
trainees.

II. The staff appear to have a divided team approach, some
inability in resolving conflict, and fair success in
building a learning climate of open interaction among
themselves and with trainees.

III. The staff appear to lack a team approach, affect the
whole center with their conflict, and/or segment the
learning climate according to the philosophy of each
trainer.

Category C. Training Staff Availability to Trainees

I. Most training staff seek extra time opportunities for
talking with individual trainees.

II. Most training staff spend extra time talking with individual
trainees.

III. Most training staff avoid spending extra time with individual
trainees.

Category D. Training Program Director Availability and Responsiveness

I. The Director is easily approachable, and often partici- 
pates in informal group and individual discussions with 
trainees. 

II. The Director is sort of approachable, and occasionally
participates in informal group and individual discussions
with trainees.

III. The Director is difficult to approach and rarely par- 
ticipates in informal group or individual discussions 
with trainees. 



Category E. Experiential Learning Based on Host Community Environment 

I. Guided learning activities in the community are scheduled 
as often as possible, and are well integrated into the 
total learning program. 

II. Learning activities in the community are occasionally
scheduled; these activities are usually integrated into
the total learning program, but sometimes suffer from
lack of staff guidance.

III. Learning activities in the community are rarely scheduled.
These activities suffer from lack of staff guidance and
are poorly integrated into the training program.

Category F. Training Staff Responsiveness to Trainee Suggestions

I. Training staff seek feedback from trainees and always
deal with suggestions and criticisms immediately to
mutually find the best solution.

II. Training staff seek feedback from trainees and usually
deal with suggestions and criticisms, but seldom take
immediate action to work out changes or solutions.

III. Training staff avoid feedback from trainees and rarely
deal with suggestions and criticisms in such a way as to
make changes or find solutions.

Category G. Cross-cultural Training Method 

I. Cross-cultural training emphasizes a variety of alternative
behaviors that are appropriate to specific situations,
utilizing the larger theoretical and cultural
context for greater understanding.

II. Cross-cultural training includes some variety of alternative
behaviors that are appropriate to specific situations,
and utilizes the historical context for greater
understanding.

III. Cross-cultural training is restricted to prescriptions
of stereotyped behavior that are appropriate or inappropriate
to the Brazilian culture.



Category H. Language Training Method

I. Language trainers use a great variety of teaching techniques
that consistently contribute to individual motivation
for language learning.

II. Language trainers use some variety in teaching techniques,
but not sufficiently to consistently contribute to individual
motivation for language learning.

III. Language trainers use little variety in teaching tech- 
niques, which contributes to loss of motivation for lan- 
guage learning. 

Category I. Coordination of Resources for Individual Needs 



I. Most training staff consistently coordinate their training
activities and resources to address the needs of an individual
trainee.

II. Some training staff coordinate their training activities
and resources to address the needs of an individual
trainee.

III. Most training staff tend to view trainees as a group and
rarely coordinate their training activities and resources
to address the needs of an individual trainee.

Step 10. Application of the Mixed Standard Scaling Method

The final step in the procedure was to apply the Mixed Standard Scaling
Method to the categories and statements resulting from previous steps. The
statements were arranged in random order and incorporated into the rating
scale, with proper instructions, scoring procedures, and methods for checking
reliability, all of which is presented in Chapter III of this report.
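
The random arrangement described in Step 10 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name, the optional seed, and the use of a pseudo-random shuffle are hypothetical, and the report does not say how its own item order was produced.

```python
import random

CATEGORIES = "ABCDEFGHI"        # the nine criterion categories
LEVELS = ("I", "II", "III")     # high, medium, and low effectiveness


def arrange_statements(seed=None):
    """Shuffle the 27 (category, level) statements and build a scoring key.

    Returns (order, key): order[i] is the statement printed as item i + 1,
    and key[category] lists the item numbers of its I, II, and III statements.
    """
    statements = [(c, level) for c in CATEGORIES for level in LEVELS]
    rng = random.Random(seed)   # hypothetical seed, used only for repeatability
    order = statements[:]
    rng.shuffle(order)
    item_of = {stmt: number for number, stmt in enumerate(order, start=1)}
    key = {c: [item_of[(c, level)] for level in LEVELS] for c in CATEGORIES}
    return order, key
```

Keeping the key alongside the shuffled order is what later allows the items to be regrouped by category for scoring.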

Development of Training Activity Evaluation Scales 

As stated earlier, this project was initiated in order to develop 
scales for use in evaluating the effectiveness of training activities. The 
procedures described in the ten steps above were followed to achieve this 
objective. However, an examination of the nine criterion categories and 
the twenty-seven effectiveness statements that resulted from this procedure 
clearly indicates that they are not suitable for purposes of assessing 
specific training activities. These criteria are much broader in nature, 
and have to do with the effectiveness of the training experience as a whole.
They represent what trainees think are the most important aspects of Peace
Corps training in general, not elements of successful training activities,
exercises, or lessons. (It was significant, and somewhat surprising, to
discover the overwhelming importance trainees placed on the training staff:
six of the fourteen criterion categories were in direct reference to the
training staff.)




It was decided, therefore, that these items would be retained for use
in evaluating training programs in general (perhaps at the middle and again
at the conclusion of training) and that a second set of criteria would be
developed to assess training activities. It was also decided that professional
trainers, rather than trainees, would be used to develop these statements,
as trainers have a better understanding of the technical elements that
combine to characterize a successful training activity.

The five steps involved in this procedure are outlined below: 

Step 1. Identification of Training Activity Assessment Criteria

Five professional Peace Corps trainers were brought together to identify
criteria of effective training activities. They were asked to brainstorm all
the elements of any given training activity (e.g., language class, case study
exercises, field experience) that contributed to its effectiveness or
ineffectiveness. Each trainer was then asked to write down the five elements
he thought were most important. A final set of five criteria was selected
by examining the combined lists of the five trainers. These five criteria
were:

A. Clarity of the objective

B. Skill of the trainer

C. Effectiveness of the method

D. Quality of materials

E. Subjective estimate of learning achieved related to Volunteer
service

Step 2. Drafting Degrees of Effectiveness for Criterion Categories 
Each trainer was then asked to write a set of three effectiveness 
statements for each of the five criterion categories resulting from Step 1. 




The set was to represent three degrees of effectiveness for each category:
high, medium, and low. These statements were used to produce a final set
of fifteen statements (a set of three for each of the five criterion
categories).

Step 3. Checking for Accurate Inclusion of Effectiveness Statements 
in Criterion Categories 

The same eleven judges who participated in developing the Training
Program Evaluation Statements were employed to determine whether the sets of
effectiveness statements were clearly perceived as belonging to the criterion
categories for which they were written. The fifteen effectiveness statements
were arranged in random order on cards, and the five categories were written
on cards and placed side by side. The judges were asked to place each
effectiveness statement in the category they thought it clearly represented. The
results of this judging indicated a high accuracy in the statements being
placed in their original categories. One of the eleven judges indicated some
difficulty distinguishing between Category B and Category C. He felt that
the method can only be as good as the trainer. However, most of the judges
had no difficulty; the items seemed to fit easily into the categories.

Step 4. Checking for Accurate Degrees of Effectiveness Statements in
Criterion Categories

The eleven judges were then asked to participate in checking the
accuracy with which the fifteen statements represented high, medium, and
low degrees of effectiveness in each category. After the statements had
been placed in their proper category (in Step 3), the judges were asked to
arrange them in order of effectiveness represented from high to medium to
low. The results of this judging indicated a high degree of accuracy. Ten
of the eleven judges placed the statements correctly in the effectiveness
sequence for which they were written.

The fifteen statements resulting from this procedure, in the proper 
categories and order of merit sequence, are listed below.

Category A. Clarity of Objective 

I. I clearly understood the objective of this activity.

II. I think I understood the objective of this activity, but
it's not too clear.

III. I did not understand the objective of this activity. 

Category B. Skill of the Trainer

I. The trainer was very skillful in conducting this activity.
The effective use of these skills greatly facilitated my
learning.

II. The trainer conducted this activity fairly well, but could
have used more skill in helping me learn.

III. The trainer was not skillful in conducting this activity, 
and did not help me learn. 

Category C. Effectiveness of Method

I. The method used in implementing this activity was very
effective in facilitating my learning.

II. The method used in implementing this activity was all
right, but it did not particularly facilitate my learning.

III. The method used in implementing this activity did not 
facilitate my learning. 

Category D. Quality of Materials 

I. The materials used in this activity were well prepared
and very relevant to the purpose of this activity.

II. The materials used in this activity were adequate and
seemed moderately relevant to the purpose of the activity.

III. The materials used in this activity were not well prepared
and seemed irrelevant to the purpose of the activity.




Category E. Subjective Estimate of Learning Achieved Related to
Volunteer Service

I. I learned a great deal from this activity which I feel has
helped me prepare for Volunteer service.

II. I learned a moderate amount from this activity and some
of what I learned has helped me prepare for Volunteer
service.

III. I learned very little from this activity, and I don't
think it has helped me prepare for Volunteer service.

Step 5. Application of the Mixed Standard Scaling Method

The final step in the procedure was to apply the Mixed Standard Scaling
Method to the categories and statements resulting from the first four steps.
The fifteen statements were arranged in random order and incorporated into
the rating scale with proper instructions, scoring procedures, and methods
for checking reliability, all of which is presented in Chapter III of this
report.



CHAPTER III. PRODUCTS 

The Training Program Evaluation Scales 

The complete system for evaluating training programs appears at the
end of this section, beginning on Page 29. The scales themselves should be
administered exactly as they appear and as described in the instructions.
It is recommended that these scales be used once or twice during the length
of a training program: about midway through the program and again near the
program's conclusion. It is important that the resulting evaluation
information be shared immediately among the training community for program
modification and improvement, and that the data be systematically analyzed
and stored in a central location (Peace Corps/Brazil or Peace Corps/Washington).

Scoring 

(A look at the evaluation scales, beginning on Page 29, at this point 
will probably make the following description easier to understand.) 

The last five items in the scales are straightforward and simple to
score. The total numbers of "yes," "a little," and "no" responses are simply
counted and tabulated for each of the five items.
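
The tabulation of the last five items can be sketched in Python as follows. This is a minimal illustration, not part of the original system; the function name and the list-of-lists input layout are assumptions.

```python
from collections import Counter

ANSWERS = ("yes", "a little", "no")


def tabulate_part_two(sheets):
    """Count the answers given to each of the five final items.

    sheets: one list of five answers per respondent (hypothetical layout).
    Returns a list of five dicts mapping each answer to its frequency.
    """
    counts = [Counter() for _ in range(5)]
    for sheet in sheets:
        for item, answer in enumerate(sheet):
            counts[item][answer] += 1
    # A Counter reports 0 for answers nobody gave, so every answer is listed.
    return [{answer: c[answer] for answer in ANSWERS} for c in counts]
```

The five resulting frequency rows correspond to the five rows of the yes / a little / no table on the Summary Sheet.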

The other twenty-seven statements are more complicated and require a
somewhat elaborate scoring procedure. These statements were arranged in
random order so that the rater could not easily determine which statements
belonged to a particular category or which statement fit in an order-of-effectiveness
sequence. These statements must be rearranged in the proper
sequence and category for scoring purposes. This is done by examining the
Scoring Keys on Page 34 and the Scoring Matrix on Page 36. The items
belonging to each category, according to the Key, are examined on each response
sheet, and the scale value is found on the Scoring Matrix.




For example, for Criterion Category A - Training Staff Expertise in
Applying Peace Corps Training Methodology - items #9, #27, and #18 are
examined in that sequence. If a given response set to these three items is
#9 = +, #27 = +, #18 = [illegible], the resulting score is 7. If another
response set to these items is #9 = [illegible], #27 = [illegible],
#18 = [illegible], the resulting score is 3, and so on for all possible
response combinations as indicated in the Scoring Matrix.
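
The lookup described above can be sketched in Python. This is an illustration only: the function and variable names are hypothetical, the key for Scale 5 assumes Item 5 as its medium statement (the only item from 1-27 not assigned to another scale), and the matrix holds only the rows that are fully legible in this copy of the report, so other response sets return None rather than a guessed value.

```python
# Scoring Keys: scale number -> item numbers of statements I, II, III.
# Item 5 for Scale 5 is an inference, since that entry is damaged in this copy.
SCORING_KEYS = {
    1: (9, 27, 18), 2: (17, 8, 26), 3: (25, 16, 7),
    4: (6, 24, 15), 5: (14, 5, 23), 6: (22, 13, 4),
    7: (3, 21, 12), 8: (11, 2, 20), 9: (19, 10, 1),
}

# Scoring Matrix rows that are fully legible: (I, II, III) marks -> points.
LEGIBLE_MATRIX = {
    ("+", "+", "0"): 7, ("+", "+", "-"): 7,
    ("0", "+", "+"): 6, ("0", "+", "0"): 6, ("0", "+", "-"): 6,
    ("-", "+", "0"): 5, ("-", "+", "-"): 5,
    ("0", "0", "-"): 4, ("+", "0", "+"): 4, ("+", "0", "-"): 4,
    ("0", "0", "0"): 4,
    ("-", "-", "+"): 3,
    ("0", "-", "0"): 2,
}


def score_scale(sheet, scale):
    """Return the scale value for one response sheet.

    sheet maps item number (1-27) to the mark "+", "0", or "-".
    Returns None when the response set is not in the partial matrix above.
    """
    response_set = tuple(sheet[item] for item in SCORING_KEYS[scale])
    return LEGIBLE_MATRIX.get(response_set)
```

For instance, a sheet with #9 = 0, #27 = +, and #18 = + regroups to the response set (0, +, +) for Scale 1, which the matrix scores as 6.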

Each resulting scale value should be listed for all respondents on the 
Scoring Work Sheet on Page 37. When all of the nine scales have been scored 
for all of the respondents, the scale sums and scale means should be com- 
puted as indicated at the bottom of the Work Sheet. Each of these nine scale 
means should then be listed on the Summary Sheet on Page 38, along with the 
frequency tabulations on the last five questions and a summary of comments 
and suggestions. Copies of the Summary Sheet should be distributed to all
staff and trainees for feedback and discussion. This Summary Sheet should
also be used for purposes of program evaluation record-keeping.
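
The scale sums and scale means called for on the Work Sheet amount to column totals and column averages. A minimal Python sketch, with a hypothetical function name and input layout, might look like this:

```python
def scale_sums_and_means(scores):
    """Compute the sum and mean of each scale across respondents.

    scores: one list of scale values per respondent, in scale order
    (nine values per respondent for the Training Program scales).
    Returns (sums, means), each with one entry per scale.
    """
    if not scores:
        raise ValueError("at least one respondent is required")
    sums = [sum(column) for column in zip(*scores)]
    means = [total / len(scores) for total in sums]
    return sums, means
```

The list of means is what would be copied onto the Summary Sheet, one value per scale.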

Error Computation 

One of the strong assets of the Mixed Standard Scaling Method is that
it is possible to keep an immediate and continuous accounting of the accuracy
of the rating procedure. Each time the scales are used, it is possible to
compute the rater errors as well as the individual scale errors. This is
done by using the Error Computation Work Sheet on Page 39. The responses
on each of the nine scales are listed, using the Scoring Keys and the Scoring
Matrix. Then, the errors are listed for each set of ratings in each scale
by referring to the Error Key appearing on Page 42. Any set of ratings
other than the seven appearing in the Error Key is in error according to the
logic of the system. Each error is noted and listed in the appropriate space
on the Work Sheet. The errors for each rater are computed by summing the
errors made by each rater across the nine scales. The average rater error
is computed by summing each rater's error and dividing by the number of
raters. A rater error percentage can be computed by dividing the average
rater error by 9.

The errors made in using each of the nine scales can be computed by 
summing the total number of errors made by the respondents on each scale. 
A scale error percentage can be computed by dividing the sum of each scale 
error by the number of respondents. 
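
The error arithmetic described above can be sketched in Python. The sketch is hedged: the names are hypothetical, and the seven consistent patterns are written out from the logic of the Mixed Standard method (a program cannot be judged better than a higher statement while judged worse than or equal to a lower one), which is an assumption wherever the printed Error Key is not fully legible.

```python
# The seven logically consistent response sets (marks for I, II, III).
# Written out from the logic of the method; treat as an assumption.
CONSISTENT = {"+++", "0++", "-++", "-0+", "--+", "--0", "---"}


def compute_errors(ratings):
    """Compute rater and scale errors.

    ratings[r][s] is the three-mark response set of rater r on scale s,
    e.g. "-0+". Every rater must have rated the same number of scales.
    """
    n_raters = len(ratings)
    n_scales = len(ratings[0])
    flags = [[0 if rs in CONSISTENT else 1 for rs in row] for row in ratings]
    rater_errors = [sum(row) for row in flags]            # per rater, across scales
    scale_errors = [sum(col) for col in zip(*flags)]      # per scale, across raters
    average_rater_error = sum(rater_errors) / n_raters
    return {
        "rater_errors": rater_errors,
        "scale_errors": scale_errors,
        "average_rater_error": average_rater_error,
        "rater_error_pct": 100 * average_rater_error / n_scales,
        "scale_error_pct": [100 * e / n_raters for e in scale_errors],
    }
```

With nine scales, the rater error percentage is the average rater error divided by 9, and each scale error percentage is that scale's error count divided by the number of respondents, as in the text.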

A complete example of how the error computation system works is presented
on Page 43. There are several ways that this information can be
interpreted and used. Referring to the example, these are:

o Scale 3 was the most reliable, since there were no errors made
  in using this scale. On the other hand, Scale 1 was the most
  unreliable, showing a 50% error rate (half of the respondents
  made an error in using this scale). The other seven scales
  fall between these two extremes. In general, the smaller the
  error, the more confidence there is for decision making in using
  the results of ratings made on the scale.

o Raters #10 and #20 were the most reliable, since they made no
  errors in using all nine scales. Raters #5, #8, and #13 were
  the most unreliable, since they made errors in four out of the
  nine scales. The scale data could be made more reliable by not
  using the ratings made by the most unreliable raters.

o The overall rater error was 26%, or conversely, the overall rater
  accuracy was 74%. This figure provides a general idea of the
  accuracy of the evaluating system. No method has yet been
  developed (of which we are aware) whereby this figure could be
  translated to a reliability correlation coefficient, so it
  cannot be interpreted as a reliability coefficient normally would.
  However, it does provide a valid and quick estimate of the
  accuracy of rating data, which should be carefully considered
  prior to decision making based on rating outcomes.
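
The suggestion to set aside the ratings of the most unreliable raters can be sketched as a simple filter. The function name and the cutoff parameter are hypothetical; the report does not fix a threshold, so any value used here is an assumption to be chosen by the evaluator.

```python
def drop_unreliable_raters(scores, rater_errors, max_errors):
    """Keep only the score rows of raters with at most max_errors errors.

    scores: one list of scale values per rater.
    rater_errors: the error count of each rater, in the same order.
    max_errors: hypothetical cutoff chosen by the evaluator.
    """
    if len(scores) != len(rater_errors):
        raise ValueError("scores and rater_errors must have the same length")
    return [row for row, errors in zip(scores, rater_errors)
            if errors <= max_errors]
```

Scale means recomputed from the retained rows would then rest only on raters whose response sets were mostly consistent.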




Training Program Evaluation Scales 
Instructions 

The scales on the following pages were constructed in order to assess
the effectiveness of several critical aspects of the training program.
Please complete the scales exactly as instructed below:

1. Complete the scales by responding to each Descriptive Statement,
   one by one, in the order in which they are presented (1-27).

2. The ratings are to be made in the following manner: consider each
   Descriptive Statement independently from the others and decide
   whether you think the training program being evaluated was worse
   than the Descriptive Statement; matched the statement; or was
   better than the statement.

   -- If you think the program was worse than the statement, place
      a - mark in the corresponding box.
   -- If you think the program matched the statement, place a 0 mark
      in the corresponding box.
   -- If you think the program was better than the statement, place
      a + mark in the appropriate box.

3. When you have finished responding to all 27 statements in the above
   manner, respond to the additional 5 statements listed on the last
   page by placing a check (✓) mark in either the "yes," "a little,"
   or "no" box.

4. Write any comments or suggestions you may have on the back of the
   paper.






TRAINING PROGRAM EVALUATION SCALES



Name 



Date 



Descriptive Statements 



1. Most training staff tend to view trainees as a group and
rarely coordinate their training activities and resources
to address the needs of an individual trainee.

2. Language trainers use some variety in teaching techniques
but not sufficiently to consistently contribute to individual
motivation for language learning.

3. Cross-cultural training emphasizes a variety of alternative
behaviors that are appropriate to specific situations,
utilizing the larger theoretical and cultural
context for greater understanding.

4. Training staff avoid feedback from trainees and rarely
deal with suggestions and criticisms in such a way as
to make changes or find solutions.

5. Learning activities in the community are occasionally
scheduled; these activities are usually integrated into
the total learning program, but sometimes suffer from
lack of staff guidance.

6. The Director is easily approachable, and often participates
in informal group and individual discussions with
trainees.

7. Most training staff avoid spending extra time with individual
trainees.

8. The staff appear to have a divided team approach, some
inability in resolving conflict, and fair success in
building a learning climate of open interaction among
themselves and with trainees.

9. The majority of training staff competently apply appropriate
Peace Corps training methodology, showing kindness
and consistency in the way they deal with trainees.

10. Some training staff coordinate their training activities
and resources to address the needs of an individual
trainee.

11. Language trainers use a great variety of teaching techniques
that consistently contribute to individual motivation
for language learning.




[Write any 
comments or 
suggestions 
you may 
have on 
the back 
of this 
paper.] 



12. Cross-cultural training is restricted to prescriptions
of stereotyped behavior that are appropriate or inappropriate
to the Brazilian culture.

13. Training staff seek feedback from trainees and usually
deal with suggestions and criticisms, but seldom take
immediate action to work out changes or solutions.

14. Guided learning activities in the community are scheduled
as often as possible, and are well integrated into the
total learning program.

15. The Director is difficult to approach and rarely participates
in informal group or individual discussions
with trainees.

16. Most training staff spend extra time talking with individual
trainees, but only when trainees approach staff
members.

17. The staff appears to have a good team approach toward
conflict resolution and building a favorable learning
climate of open interaction among themselves and with
trainees.

18. Only a few trainers competently apply appropriate Peace
Corps training methodology, showing kindness and consistency
in the way they deal with trainees.

19. Most training staff consistently coordinate their training
activities and resources to address the needs of an
individual trainee.

20. Language trainers use little variety in teaching techniques,
which contributes to loss of motivation for
language learning.

21. Cross-cultural training includes some variety of alternative
behaviors that are appropriate to specific
situations, and utilizes the historical context for
greater understanding.

22. Training staff seek feedback from trainees and always
deal with suggestions and criticisms immediately to
mutually find the best solution.

23. Learning activities in the community are rarely scheduled.
These activities suffer from lack of staff
guidance and are poorly integrated into the training
program.




24. The Director is sort of approachable, and occasionally 
participates in informal group and individual discussions 
with trainees. 

25. Most training staff seek extra time opportunities for
talking with individual trainees.

26. The staff appear to lack a team approach, affect the 
whole center with their conflict, and/or segment the 
learning climate according to the philosophy of each 
trainer. 

27. About half of the training staff competently apply appropriate
Peace Corps training methodology, showing kindness
and consistency in the way they deal with trainees.






Part II

                                                    Yes    A Little    No

The training program has realistic job-
centered objectives relevant to the Peace           [ ]      [ ]       [ ]
Corps Volunteer program in Brazil.

Job descriptions contain specific infor-            [ ]      [ ]       [ ]
mation and are up-dated for accuracy.

Trainees are offered adequate opportunities
to sample off-training-center cultural              [ ]      [ ]       [ ]
activities during leisure time.

The learning climate is physically good,            [ ]      [ ]       [ ]
conducive to serious study.

The training schedule is compatible with            [ ]      [ ]       [ ]
trainee energy levels and leisure time needs.






TRAINING PROGRAM EVALUATION SCALES
Scoring Keys 



Scoring Key 1. Training Staff Expertise in Applying Peace Corps Training
               Methodology

   (High)   I   = Item 9
   (Medium) II  = Item 27
   (Low)    III = Item 18

Scoring Key 2. Training Staff Team Performance

   I   = Item 17
   II  = Item 8
   III = Item 26

Scoring Key 3. Training Staff Availability to Trainees

   I   = Item 25
   II  = Item 16
   III = Item 7

Scoring Key 4. Training Program Director Availability and Responsiveness

   I   = Item 6
   II  = Item 24
   III = Item 15

Scoring Key 5. Experiential Learning Based on Host Community Environments

   I   = Item 14
   II  = Item 5
   III = Item 23

Scoring Key 6. Training Staff Responsiveness to Trainee Suggestions

   I   = Item 22
   II  = Item 13
   III = Item 4

Scoring Key 7. Cross-Cultural Training Method

   I   = Item 3
   II  = Item 21
   III = Item 12

Scoring Key 8. Language Training Method

   I   = Item 11
   II  = Item 2
   III = Item 20

Scoring Key 9. Coordination of Resources for Individual Needs

   I   = Item 19
   II  = Item 10
   III = Item 1






TRAINING PROGRAM EVALUATION SCALES

Scoring Matrix

      Descriptive Statements
    I         II        III
  (High)   (Medium)    (Low)      Points

    +         +          0          7
    +         +          -          7
    0         +          +          6
    0         +          0          6
    0         +          -          6
    -         +          0          5
    -         +          -          5
    0         0          -          4
    +         0          +          4
    +         0          -          4
    0         0          0          4
    -         -          +          3
    0         -          0          2

[The remaining rows are only partly legible in the source. The surviving
marks and point values read: "+ + 7", "+ + 5", "0 - + - 5", "0 + 4",
"0 - 3", "- + 3", "0 3", "0 2", followed by the point values 2, 1, 1, 1.]

+ = The program was better than the statement.

0 = The statement matches the program.

- = The program was worse than the statement.






TRAINING PROGRAM EVALUATION SCALES

Summary Sheet

Training Program ____________________
Date ____________________

                                                            Mean Scores

Scale 1. Training Staff Expertise in Applying
         Peace Corps Training Methodology                   __________

Scale 2. Training Staff Team Performance                    __________

Scale 3. Training Staff Availability to Trainees            __________

Scale 4. Training Program Director Availability
         and Responsiveness                                 __________

Scale 5. Experiential Learning Based on Host
         Community Environments                             __________

Scale 6. Training Staff Responsiveness to Trainee
         Suggestions                                        __________

Scale 7. Cross-Cultural Training Method                     __________

Scale 8. Language Training Method                           __________

Scale 9. Coordination of Resources for Individual Needs     __________

                                          Yes      A Little      No

Realistic, Job-Centered Objectives        ____       ____       ____

Job Descriptions                          ____       ____       ____

Brazilian Cultural Activities             ____       ____       ____

Learning Climate                          ____       ____       ____

Training Schedule                         ____       ____       ____



Summary of Comments and Suggestions 






TRAINING PROGRAM EVALUATION SCALES 
Error Computation Work Sheet 



[Work sheet grid: for each rater, the response set and any error are entered
for each of the nine scales.]

Scale Error Sums (one entry for each scale):     ____ out of 20 = ____ %

Rater Error Sums (one entry for each rater):     ____ out of 9

Average Rater Error:                             ____ out of 9

Rater Error % = (Average Sum / 9) =              ____ %

ERROR KEY

   I         II         III
(High)    (Medium)     (Low)

   +         +          +
   0         +          +
   -         +          +
   -         0          +
   -         -          +
   -         -          0
   -         -          -

+ = The activity or program is better than the statement.

0 = The statement matches the activity or program.

- = The activity or program is worse than the statement.

USE OF KEY: Any set of ratings other than the seven listed here is in error.
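
The rule stated in the Error Key can be sketched in a few lines of Python. This is only an illustration: the function name `is_error` and the string encoding of a response set (three symbols in column order I, II, III) are assumptions made for the sketch, and the seven patterns are those of the key as reconstructed from the source.

```python
# The seven consistent rating patterns from the Error Key,
# written in column order I (High), II (Medium), III (Low).
CONSISTENT_SETS = {"+++", "0++", "-++", "-0+", "--+", "--0", "---"}


def is_error(response_set: str) -> bool:
    """Return True when a three-symbol response set is not in the Error Key."""
    cleaned = response_set.replace(" ", "")
    if len(cleaned) != 3 or any(symbol not in "+0-" for symbol in cleaned):
        raise ValueError(f"not a valid response set: {response_set!r}")
    return cleaned not in CONSISTENT_SETS


print(is_error("- 0 +"))  # False: worse than High, matches Medium, better than Low
print(is_error("0 0 -"))  # True: not one of the seven listed patterns
```

A set such as `0 0 -` is flagged because a rater cannot logically say the program matches both the High and the Medium statement while being worse than the Low one.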



I 



TRAINING PROGRAM EVALUATION SCALES
Example of Error Computation

[Completed example work sheet for 20 raters, showing Scales 6 through 9. The row-and-column alignment of the individual response sets is not recoverable. Sets marked with an error (1) include 0 0 -, + + 0, 0 - +, + + -, and + - +; sets such as + + +, 0 + +, - + +, - 0 +, and - - + carry no error mark.]

Scale Error Sums

6.   4 out of 20 = 20%
7.   4 out of 20 = 20%
8.   4 out of 20 = 20%
9.   6 out of 20 = 30%

Rater Error Sums: one entry per rater, each out of 9 (entries largely illegible)

Average Rater Error = (illegible) out of 9

Rater Error % = (Aver. Sum / 9) = (illegible)






46 

The Training Activity Evaluation Scales 

The complete system for evaluating specific training activities appears
at the end of this section, beginning on Page 49. The scales themselves
should be administered exactly as they appear and as described in the
instructions. It is recommended that these scales be used at the end of the
second week of training, and either each week or every other week thereafter
through the conclusion of the program. The four or five major training
activities conducted during the evaluation period (e.g., morning language
classes, case study exercise, lecture on history) should be listed on the
scales before they are distributed for completion of ratings. It is important
that the resulting evaluation information be shared immediately among the
training community for program modification and improvement, and that the
data be systematically analyzed and stored in a central location (Peace
Corps/Brazil or Peace Corps/Washington).

Scoring 

The fifteen rating statements are arranged in random order on the scales
so that the rater cannot easily determine which statements belong to
particular categories or which statements fit in an order-of-effectiveness
sequence. The statements must be rearranged in their proper sequence and
category for scoring purposes. This is done by examining the Scoring Keys
beginning on Page 52 and the Scoring Matrix on Page 57. The items belonging
to each category according to the Key are examined on each response sheet
and the scale value is found on the Scoring Matrix. Since the scales are
designed in such a way that five separate training activities can be
evaluated on one page, scoring overlays have been prepared for use in scoring
each of the criterion categories as they are presented here. Each of the



47 

five criterion categories can therefore be scored by placing the scoring key
overlay for a particular category on a response sheet and noting the scores
for that category across each of the activities rated. (The sample keys
shown here can be overlaid on each of the two sheets used for the scales.)

For example, for Criterion Category 1 (Clarity of Objective), items
#10, #5, and #15 appear in the scoring overlay and are examined in that
sequence. If a given response set to these items for Activity A is #10 = +,
#5 = +, and #15 = +, the resulting score is 7. If another response set to
these items, on Activity B, is #10 = #5 = and #15 = the resulting 
score is 3, and so on for all possible response combinations as indicated in
the Scoring Matrix for each overlay across the activities being rated.
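
The lookup described above can be sketched as follows. This is a minimal illustration, not the report's full Scoring Matrix: it assumes the seven consistent patterns earn 7 down to 1 point in the order of the Error Key (the text confirms + + + = 7), and it returns `None` for inconsistent patterns, whose point values must be read from the Scoring Matrix itself. The names `CONSISTENT_POINTS`, `CLARITY_ITEMS`, and `score_category` are hypothetical.

```python
from typing import Dict, Optional, Sequence

# Assumed point values for the seven consistent patterns (columns I, II, III).
# Inconsistent patterns also receive points in the full Scoring Matrix;
# they are deliberately not reproduced here.
CONSISTENT_POINTS = {
    "+++": 7, "0++": 6, "-++": 5, "-0+": 4, "--+": 3, "--0": 2, "---": 1,
}

# Item numbers for Criterion Category 1 in overlay order (High, Medium, Low).
CLARITY_ITEMS = (10, 5, 15)


def score_category(ratings: Dict[int, str],
                   items: Sequence[int] = CLARITY_ITEMS) -> Optional[int]:
    """Look up the points for one criterion category on one activity."""
    pattern = "".join(ratings[item] for item in items)
    return CONSISTENT_POINTS.get(pattern)


activity_a = {10: "+", 5: "+", 15: "+"}
print(score_category(activity_a))  # 7
```

Passing a different `items` tuple, for example `(14, 9, 4)` for Skill of the Trainer, scores another category from the same response sheet.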

Each resulting scale value should be listed for all respondents on the
Scoring Work Sheets beginning on Page 58. When all five scales have
been scored for all of the respondents across all activities being evaluated,
the scale sums and scale means for each activity should be computed as
indicated at the bottom of the Work Sheets. Each of these five scale means for
each of the activities being evaluated should then be listed on the Summary
Sheet on Page 63. Copies of the Summary Sheet should be distributed to all
staff and trainees for feedback and discussion. This Summary Sheet should
also be used for program evaluation record-keeping.
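
The Work Sheet arithmetic (a sum and a mean, Sum / N, for each scale of one activity) might be sketched like this. The function name and the sample scores are invented for illustration and do not come from the report.

```python
def scale_sums_and_means(scores_by_scale):
    """Map each scale name to (sum, mean) of the trainees' scale values."""
    summary = {}
    for scale, scores in scores_by_scale.items():
        total = sum(scores)
        summary[scale] = (total, total / len(scores))  # mean = Sum / N
    return summary


# Hypothetical scale values for four trainees rating Activity A.
activity_a = {
    "Clarity of Objective": [7, 5, 6, 6],
    "Skill of Trainer": [4, 4, 5, 3],
}

for scale, (total, mean) in scale_sums_and_means(activity_a).items():
    print(f"{scale}: sum = {total}, mean = {mean}")
```

The means produced for each activity are the values that would be copied into that activity's column on the Summary Sheet.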

As described in the previous section for the Training Program Evaluation
Scales, each time these Activity Scales are used, it is possible to compute
the rater errors as well as the individual scale errors. In this case,
however, errors can also be computed for each of the activities being
evaluated at any given time. The error computation is done by using the Error
Computation Work Sheet on Page 64. Only one Work Sheet has been prepared,




48 

but additional sheets exactly like this one should be prepared for each
activity that has been evaluated. The responses on each of the five scales
are listed using the Scoring Keys and the Scoring Matrix. Then the errors
are listed for each set of ratings and each scale by referring to the Error
Key presented on Page 65. Any set of ratings other than the seven appearing
in the Error Key is in error according to the logic of the system. Each
error is noted and listed in the appropriate space on the Work Sheet. The
errors for each rater are computed by summing the errors made by that rater
across the five scales. The average rater error is computed by summing the
rater errors and dividing by the number of raters. A rater error percentage
can be computed for each activity being evaluated by dividing the average
rater error by five.
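
The rater error steps above can be sketched as below. The sketch assumes each rater's errors are recorded as 0/1 marks, one per scale; the function name and sample marks are hypothetical, and the multiplication by 100 is added here only to express the quotient as a percentage.

```python
def rater_error_summary(error_marks, n_scales=5):
    """error_marks: one list per rater with a 0/1 error mark for each scale."""
    rater_sums = [sum(marks) for marks in error_marks]   # errors per rater
    average = sum(rater_sums) / len(rater_sums)          # average rater error
    percentage = 100 * average / n_scales                # rater error percentage
    return rater_sums, average, percentage


# Hypothetical marks for four raters on one activity (five scales each).
marks = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

print(rater_error_summary(marks))  # ([1, 0, 2, 0], 0.75, 15.0)
```

For the Training Program Evaluation Scales the same function would be called with `n_scales=9`, since those rater error sums are out of nine.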

The errors made in using each of the five scales can be computed by
summing the total number of errors made by the respondents on each scale.
A scale error percentage can be computed by dividing each scale error sum
by the number of respondents. An example of how this error computation
system works for these Training Activity Evaluation Scales is not presented.
However, the example for the Training Program Evaluation Scales appearing
on Page 43 is sufficient for these purposes. Referring to that example, the
error information for these scales can be interpreted and used in the same
ways described on Page 27. That is, each of the five scale reliabilities
can be determined. The errors made by each rater can be determined, and the
scale data made more reliable by not using the ratings made by the most
unreliable raters. The overall rater error, or conversely, the overall rater
accuracy, can be determined for each of the training activities evaluated,
which provides a general idea of the accuracy of the evaluation system.
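
A short sketch of the scale error percentage, together with one possible way of setting aside the least reliable raters, follows. The report does not state a cutoff for "most unreliable"; the `max_errors` threshold, the function names, and the sample marks are all assumptions made for illustration.

```python
def scale_error_percentages(error_marks):
    """Percentage of respondents in error on each scale (column totals / N)."""
    n_raters = len(error_marks)
    n_scales = len(error_marks[0])
    return [
        100 * sum(marks[scale] for marks in error_marks) / n_raters
        for scale in range(n_scales)
    ]


def reliable_raters(error_marks, max_errors=1):
    """Indices of raters whose error sum does not exceed a hypothetical cutoff."""
    return [i for i, marks in enumerate(error_marks) if sum(marks) <= max_errors]


# Hypothetical 0/1 error marks: four raters (rows) by five scales (columns).
marks = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

print(scale_error_percentages(marks))  # [25.0, 0.0, 50.0, 0.0, 0.0]
print(reliable_raters(marks))          # [0, 1, 3]
```

Scale means could then be recomputed from the retained raters only, which is the sense in which the text speaks of making the scale data more reliable.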



49 

Training Activity Evaluation Scales 
Instructions 

The scales on the following page were constructed in order to assess the
effectiveness of training activities. Please complete the scales exactly as
instructed below:

1. Rate each activity one at a time by responding to all fifteen
   descriptive statements for the first activity, then for the second,
   third, fourth, and fifth.

2. The ratings are to be made in the following manner: consider each
   Descriptive Statement independently of the others, one at a time, in
   the order listed (1-15). Decide whether you think the training
   activity being evaluated was worse than the Descriptive Statement,
   the activity matched the statement, or the activity was better than
   the statement.

   -- If you think the activity was worse than the statement, place
      a - mark in the appropriate box.

   -- If you think the activity matched the statement, place a 0 mark
      in the appropriate box.

   -- If you think the activity was better than the statement, place
      a + mark in the appropriate box.

3. Write any comments or suggestions you may have on the back of the
   paper.






50 



TRAINING ACTIVITY EVALUATION SCALES

Name
Date

Descriptive Statements                                  Activities
                                                        A  B  C  D  E

 1. I learned very little from this activity
    and I don't think it has helped me prepare
    for Volunteer service.

 2. The materials used in this activity were
    adequate and seemed moderately relevant
    to the purpose of the activity.

 3. The method used in implementing this activity
    was very effective in facilitating
    my learning.

 4. The trainer was not skillful in conducting
    this activity, and did not help me
    learn.

 5. I think I understood the objective of
    this activity, but it was not very clear.

 6. I learned a great deal from this activity
    which I feel has helped me prepare for
    Volunteer service.

 7. The materials used in this activity were
    not well prepared and seemed irrelevant
    to the purpose of the activity.

 8. The method used in implementing this activity
    was all right, but it did not particularly
    facilitate my learning.

 9. The trainer conducted this activity fairly
    well, but could have used more skill in
    helping me learn.

10. I clearly understood the objective of this
    activity.

11. I learned a moderate amount from this activity
    and some of what I learned has
    helped me prepare for Volunteer service.

12. The materials used in this activity were
    well prepared and very relevant to the
    purpose of this activity.

[Write any comments or suggestions you may have on the back of this paper.]



51 



Descriptive Statements                                  Activities
                                                        A  B  C  D  E

13. The method used in implementing this activity
    did not facilitate my learning.

14. The trainer was very skillful in conducting
    this activity. The effective use of
    these skills greatly facilitated my
    learning.

15. I did not understand the objective of
    this activity.

[Write any comments or suggestions you may have on the back of this paper.]






52 



TRAINING ACTIVITY EVALUATION SCALES
Scoring Keys

Scoring Key 1. Clarity of Objective

                                                  A  B  C  D  E

15. I did not understand the objective
    of this activity.                               Cut Out         III

 5. I think I understood the objective of
    this activity, but it's not too clear.          Cut Out         II

10. I clearly understood the objective of
    this activity.                                  Cut Out         I



53 



Scoring Key 2. Skill of the Trainer

                                                  A  B  C  D  E

14. The trainer was very skillful in conducting
    this activity. The effective
    use of these skills greatly facilitated
    my learning.                                    Cut Out         I

 4. The trainer was not skillful in conducting
    this activity, and did not
    help me learn.                                  Cut Out         III

 9. The trainer conducted this activity
    fairly well, but could have used more
    skill in helping me learn.                      Cut Out         II



54 



Scoring Key 3. Effectiveness of Method

                                                  A  B  C  D  E

13. The method used in implementing this
    activity did not facilitate my learning.        Cut Out         III

 3. The method used in implementing this
    activity was very effective in
    facilitating my learning.                       Cut Out         I

 8. The method used in implementing this
    activity was all right, but it did not
    particularly facilitate my learning.            Cut Out         II



55 



Scoring Key 4. Quality of Materials

                                                  A  B  C  D  E

 2. The materials used in this activity were
    adequate and seemed moderately relevant
    to the purpose of the activity.                 Cut Out         II

 7. The materials used in this activity
    were not well prepared and seemed
    irrelevant to the purpose of the activity.      Cut Out         III

12. The materials used in this activity
    were well prepared and very relevant to
    the purpose of this activity.                   Cut Out         I



56 



Scoring Key 5. Learning Achieved

                                                  A  B  C  D  E

 1. I learned very little from this activity
    and I don't think it has helped me
    prepare for Volunteer service.                  Cut Out         III

 6. I learned a great deal from this
    activity which I feel has helped me prepare
    for Volunteer service.                          Cut Out         I

11. I learned a moderate amount from this
    activity and some of what I learned has
    helped me prepare for Volunteer service.        Cut Out         II



57 

TRAINING ACTIVITY EVALUATION SCALES

Scoring Matrix

Descriptive Statements                    Points

   I         II         III
(High)    (Medium)     (Low)



+ 


+ 


+ 


7 


+ 




0 


7 


+ 


• + 




7 


0 


+ 


+ 


6 


0 


+ 


0 


6 


0 


+ 




6 




+ 


+ 


5 






0 


5 




+ 




5 


0 




+ 


5 




0 


+ 


4 


0 


0 




4 


+ 


0 


+ 


4 


+ 


0 




4 




0 


0 


4 








3 




0 




.3 






+ 


3 




0 


0 


3 






0 


2 


0 




0 


2 


+ 




0 


2 








1 


+ 






1 


0 






1 



+ = The activity is better than the statement.

0 = The statement matches the activity.

- = The activity is worse than the statement.



TRAINING ACTIVITY EVALUATION SCALES
Scoring Work Sheets

Activity A

Scales:           1.            2.           3.               4.            5.
                  Clarity of    Skill of     Effectiveness    Quality of    Learning
                  Objective     Trainer      of Method        Material      Achieved
Trainees

1 through 20 (one row per trainee)

Sums              1.            2.           3.               4.            5.

Means (Sum / N)   1.            2.           3.               4.            5.

Write these values down column A on the summary sheet.



59 



Activity B

Scales:           1.            2.           3.               4.            5.
                  Clarity of    Skill of     Effectiveness    Quality of    Learning
                  Objective     Trainer      of Method        Material      Achieved
Trainees

1 through 20 (one row per trainee)

Sums              1.            2.           3.               4.            5.

Means (Sum / N)   1.            2.           3.               4.            5.

Write these values down column B on the summary sheet.



Activity C

Scales:           1.            2.           3.               4.            5.
                  Clarity of    Skill of     Effectiveness    Quality of    Learning
                  Objective     Trainer      of Method        Material      Achieved
Trainees

1 through 20 (one row per trainee)

Sums              1.            2.           3.               4.            5.

Means (Sum / N)   1.            2.           3.               4.            5.

Write these values down column C on the summary sheet.



61 



Activity D

Scales:           1.            2.           3.               4.            5.
                  Clarity of    Skill of     Effectiveness    Quality of    Learning
                  Objective     Trainer      of Method        Material      Achieved
Trainees

1 through 20 (one row per trainee)

Sums              1.            2.           3.               4.            5.

Means (Sum / N)   1.            2.           3.               4.            5.

Write these values down column D on the summary sheet.



62 



Activity E

Scales:           1.            2.           3.               4.            5.
                  Clarity of    Skill of     Effectiveness    Quality of    Learning
                  Objective     Trainer      of Method        Material      Achieved
Trainees

1 through 20 (one row per trainee)

Sums              1.            2.           3.               4.            5.

Means (Sum / N)   1.            2.           3.               4.            5.

Write these values down column E on the summary sheet.




TRAINING ACTIVITY EVALUATION SCALES 
Summary Sheet 

Training Program 

Week Evaluated 

Date 

Activity Activity Activity Activity Activity 
A B C D E 

Scale 1.
Clarity of Objective

Scale 2.
Skill of Trainer

Scale 3.
Effectiveness of Method

Scale 4.
Quality of Materials

Scale 5.
Learning Achieved

Summary of Comments and Suggestions 



64

[Error Computation Work Sheet for the Training Activity Evaluation Scales: blank form, printed sideways in the source; the grid is not recoverable. Recoverable labels: a response-set column and a "Rater Errors" column for each of the five scales (1. Clarity of Objective, 2. Skill of Trainer, 3. Effectiveness of Method, 4. Quality of Materials, 5. Learning Achieved); rows for raters 1 through 20; a scale error sum out of 20 with a percentage for each scale; and a rater error sum out of 5 for each rater.]




ERROR KEY

   I         II         III
(High)    (Medium)     (Low)

   +         +          +
   0         +          +
   -         +          +
   -         0          +
   -         -          +
   -         -          0
   -         -          -

+ = The activity or program is better than the statement.

0 = The statement matches the activity or program.

- = The activity or program is worse than the statement.

USE OF KEY: Any set of ratings other than the seven listed here is in error.



IV. CONCLUSION

The two evaluation scaling systems produced during this project repre- 
sent significant improvements over conventional methods. Their systematic 
use in evaluating Peace Corps training would result in much more accurate 
and reliable information than the methods now being employed. The weaknesses 
in the present system that these two methods will overcome, as well as other 
benefits of the new scaling procedures, have been discussed in previous 
sections. The new methods do have drawbacks, however, two of which are 
listed below along with recommendations for modification: 

o The nine criterion categories, along with the twenty-seven
degree-of-effectiveness statements in the Training Program
Evaluation Scales, represent what the participants (former
trainees) in this study thought were the most important aspects
of training. These may not be adequate for particular purposes
of evaluation inquiry, or for other training programs. It is
recommended that where these categories seem inappropriate, new
and more relevant ones be developed according to the
Retranslation procedure.

o Compared with conventional rating scales, completing these scales
is a rather laborious task. (For the Training Program Evaluation
Scales, the rater must make thirty-two separate judgments; for
the Training Activity Evaluation Scales, the rater must make
seventy-five different judgments in assessing five different
training activities.) Furthermore, after the statements have
been checked by the rater, the scoring is somewhat time consuming.

The extra effort that this complexity and sophistication
represents seems worthwhile in light of the consequences of decisions
made on the basis of evaluation data. More accurate and reliable data
will require more time and effort. However, where this system



68 

seems too complicated, a modification can be made that greatly
simplifies the system, although the accuracy and reliability of
the data suffer. This modification involves eliminating the
use of the Mixed Standard Scaling method and replacing it with
a simple three- or five-point continuous scale for each criterion
category. For example, the three statements describing high,
medium, and low effectiveness for the "Quality of Training
Materials" category would be arranged along a five-point scale.
The rater would make one judgment (instead of three) for this
rating by selecting a number from one to five.

In conclusion, it is recommended that these new scaling systems be put
to use in evaluating Peace Corps training programs, and that data be collected
on a systematic basis in order to render the systems maximally useful. The
rater, scale, and activity reliabilities should be computed and recorded, so
that decisions can be made with known degrees of confidence and so that the
scales themselves can be modified for greater usefulness, accuracy, and
reliability.



69 



REFERENCES 



Barrett, R. S., Performance Rating. Chicago: Science Research Association,
Inc., 1966.

Bass, , "Further Evidence on the Dynamic Character of Criteria,"
Personnel Psychology, Vol. 15, 1962.

Beatty, Walcott H. (Chairman and Editor), Improving Educational Assessment
and an Inventory of Measures of Affective Behavior. Association for
Supervision and Curriculum Development, NEA, Washington, D. C., 1969.

Bend, Emil, "The Impact of the Social Setting upon Evaluative Research,"
Evaluative Strategies and Methods, American Institutes of Research, 1970.

Blanz, Friedrich, and Edwin E. Ghiselli, "The Mixed Standard Scale: A New
Rating System," Personnel Psychology, Vol. 25, No. 2, 1972.

Bloom, Benjamin S., J. Thomas Hasting, and George F. Madaus, Handbook on
Formative and Summative Evaluation of Student Learning. New York:
McGraw-Hill, 1971.

Campbell, Donald T., and Julian C. Stanley, "Experimental and Quasi-
Experimental Designs for Research in Teaching," in N. L. Gage (Ed.),
Handbook of Research on Teaching. Chicago: Rand McNally, 1963.

Carver, Ronald P., "Special Problems in Measuring Change," Evaluative
Strategies and Methods, American Institutes of Research, 1970.

Center for Research and Education Training Directors, Final Reports, Brazil
Peace Corps Training Programs, 1972-1973. Center for Research and Education,
Denver, Colorado (Peace Corps Contract PC-72-42043).

Edwards, A. L., Techniques of Attitude Scale Construction. New York:
Appleton-Century Crofts, 1957.

Fitzpatrick, Robert, "The Selection of Measures for Evaluating Programs,"
Evaluative Strategies and Methods, American Institutes of Research, 1970.

Garner, W. R., "Rating Scales, Discriminability, and Information Transmission,"
Psychological Review, Vol. 67, 1960.

Glass, Gene V., The Growth of Evaluation Methodology, Research Paper No. 27,
Laboratory of Educational Research, University of Colorado, March 1969.

Glennan, Jr., Thomas K., Evaluating Federal Manpower Programs: Notes and
Observations. Rand Corporation, Santa Monica, California, September 1969.



70 



Gold, Norman, "An Illustration: Evaluating a Complex Social Program,"
Evaluative Strategies and Methods, American Institutes of Research,
1970.

Guion, R. M., "Criterion Measurement and Personnel Judgments," Personnel
Psychology, Vol. 14, 1961.

Hawkridge, David G., "Designs for Evaluative Studies," Evaluative Strategies
and Methods, American Institutes of Research, 1970.

Johnson, George H., "The Purpose of Evaluation and the Role of the Evaluator,"
Evaluative Strategies and Methods, American Institutes of Research, 1970.

Kaufman, Roger A., Educational System Planning. Englewood Cliffs: Prentice-
Hall, Inc., 1972.

Margolis, Frederic, and Steve Gillespie, Programming for Volunteer Service
(developed for ACTION). F. M. Associates Ltd., Rockville, Maryland.

Osgood, Charles E., "Exploration in Semantic Space: A Personal Diary,"
Journal of Social Issues, Vol. 27, No. 4, 1971.

Prien, E. P., "Dynamic Character of Criteria: Organization Change," Journal
of Applied Psychology, Vol. 50, 1966.

Schultz, D. G., and A. I. Siegel, Post-Training Performance Criterion
Development and Applications: A Selective Review of Methods for Measuring
Individual Differences in On-the-Job Performance. Applied Psychology
Service, Wayne, Pennsylvania, 1961.

Scrivens, Michael, "Methodology of Evaluation," Perspectives of Curriculum
Evaluation, Vol. 1, New York: Rand McNally and Company, 1967.

Smith, Patricia Cain, and L. Kendall, "Retranslation of Expectations:
An Approach to the Construction of Unambiguous Anchors for Rating Scales,"
Journal of Applied Psychology, Vol. 47, No. 2, 1963.

Stake, Robert E., "Language, Rationality, and Assessment," Improving Educational
Assessment and An Inventory of Measures of Affective Behavior, Association
for Supervision and Curriculum Development, NEA, 1969.

Stevick, Earl, Memory, Meaning and Method: Some Psychological Perspectives
for Language Teachers. Foreign Service Institute, Department of State,
Washington, D. C. (In draft form, not yet published, 1973.)

Stufflebeam, Daniel L., "Evaluation as Enlightenment for Decision-Making,"
Improving Educational Assessment and An Inventory of Measures of Affective
Behavior, Association for Supervision and Curriculum Development, NEA, 1969.

Tatsuoka, Maurice M., Nationwide Evaluation and Experimental Design, University
of Illinois at Urbana-Champaign, 1972. Paper prepared for the 1972 Annual
Meeting of the American Educational Research Association.



71 



Torgerson, Warren S., Theory and Methods of Scaling. New York: John Wiley
& Sons, Inc., 1958.

Tyler, Ralph W., "The Purposes of Assessment," Improving Educational
Assessment and An Inventory of Measures of Affective Behavior, Association for
Supervision and Curriculum Development, NEA, 1969.

Wallace, S. R., "Criteria for What?" American Psychologist, Vol. 20, 1965.

Wight, Albert R., William L. Wight, and Mary Anne Hammons, Guidelines for
Peace Corps Cross-Cultural Training, Parts I, II, III, IV. Center for
Research and Education, Denver, Colorado, March 1970 (Peace Corps Contract
PC-25-1710).






